Using RStudio on Hoffman2

Hoffman2 Happy Hour

Charles Peterson

Welcome to Hoffman2 Happy Hours

🎉 Welcome to the Hoffman2 Happy Hours

  • Short presentations on topics related to HPC and Hoffman2
  • Thoughts for future “Happy Hour” topics? 💡

📧 cpeterson@oarc.ucla.edu

Files For This Presentation

This presentation can be found on our GitHub page

https://github.com/ucla-oarc-hpc/H2HH_rstudio


The HTML slides can be found at

https://ucla-oarc-hpc.github.io/H2HH_rstudio


More information and scripts on using RStudio on Hoffman2

https://github.com/ucla-oarc-hpc/H2-RStudio

RStudio Information

What Is RStudio

RStudio is a great IDE for working with R and viewing files.


But why do you want to use RStudio on Hoffman2 when you can use your own computer???

RStudio on Hoffman2 provides access to:

  • higher memory
  • multi-core
  • GPUs
  • Your data on Hoffman2

RStudio Formats

There are two main (free) RStudio formats that researchers can use


RStudio Desktop

  • Standalone desktop application

  • Installed locally on your machine

RStudio Server

  • Run RStudio as a server process
  • Open on a web browser

RStudio on Hoffman2

RStudio Desktop can be inefficient on Hoffman2

  • Requires X11 forwarding
  • slow
  • sluggish interaction

RStudio Server is the best way to use RStudio on Hoffman2

Running RStudio

Running RStudio (1)

Get An Interactive Job

Containers cannot run on login nodes.

  • You MUST use a compute node


qrsh -l h_data=10G

Modify the qrsh command to meet your RStudio computing needs

  • More memory and/or job time
qrsh -l h_data=50G,h_rt=5:00:00
  • More cores
qrsh -l h_data=10G -pe shared 10
  • Using GPUs
qrsh -l h_data=10G,gpu,V100
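
The options above can be combined in a single request. A minimal sketch, assuming a hypothetical helper named build_qrsh (not part of Hoffman2) that only prints the qrsh command so you can review it before running it:

```shell
#!/bin/bash
# Hypothetical helper: print (not run) a qrsh command for the requested resources.
# Usage: build_qrsh MEMORY [RUNTIME] [CORES]
build_qrsh() {
    local mem="$1" runtime="${2:-}" cores="${3:-1}"
    local resources="h_data=${mem}"
    # Append a run-time limit only when one was requested
    if [ -n "$runtime" ]; then
        resources="${resources},h_rt=${runtime}"
    fi
    local cmd="qrsh -l ${resources}"
    # Request the shared parallel environment for more than one core
    if [ "$cores" -gt 1 ]; then
        cmd="${cmd} -pe shared ${cores}"
    fi
    echo "$cmd"
}

build_qrsh 10G               # memory only
build_qrsh 50G 5:00:00       # more memory and job time
build_qrsh 10G "" 10         # more cores
build_qrsh 50G 5:00:00 10    # all three together
```

Copy the printed line into your Hoffman2 terminal once it matches what you need.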

Running RStudio (2)

Create Temp Directories

  • Create writable temp directories
    • RStudio writes small files
    • Anywhere you have write access



mkdir -pv $SCRATCH/rstudiotmp/var/lib
mkdir -pv $SCRATCH/rstudiotmp/var/run
mkdir -pv $SCRATCH/rstudiotmp/tmp
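
The same three directories can be created in a loop and checked for write access. A minimal sketch; the variable name RSTUDIO_TMP and the fallback to $TMPDIR or /tmp when $SCRATCH is unset are assumptions for illustration, not Hoffman2 conventions:

```shell
#!/bin/bash
# Base directory for RStudio's temporary files.
# Assumption: fall back to $TMPDIR (or /tmp) when $SCRATCH is not set.
RSTUDIO_TMP="${SCRATCH:-${TMPDIR:-/tmp}}/rstudiotmp"

# Create the three directories that the container will later bind-mount
for sub in var/lib var/run tmp; do
    mkdir -p "${RSTUDIO_TMP}/${sub}"
done

# Confirm each directory exists and is writable
for sub in var/lib var/run tmp; do
    if [ -d "${RSTUDIO_TMP}/${sub}" ] && [ -w "${RSTUDIO_TMP}/${sub}" ]; then
        echo "ok: ${RSTUDIO_TMP}/${sub}"
    else
        echo "not writable: ${RSTUDIO_TMP}/${sub}" >&2
    fi
done
```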

Running RStudio (3)

Load the Apptainer Module

  • Apptainer is software that will run the RStudio container


module load apptainer

RStudio Server on Hoffman2 runs from a container built with Docker

  • The container's R is separate from the R modules on Hoffman2
    • DO NOT load R modules
    • R packages may need to be reinstalled

Running RStudio (4)

Start Up RStudio

apptainer run \
 -B $SCRATCH/rstudiotmp/var/lib:/var/lib/rstudio-server \
 -B $SCRATCH/rstudiotmp/var/run:/var/run/rstudio-server \
 -B $SCRATCH/rstudiotmp/tmp:/tmp \
 $H2_CONTAINER_LOC/h2-rstudio_4.1.0.sif
  • apptainer run
    • Starts the RStudio container
  • -B $SCRATCH/rstudiotmp/[dir]:[/dir]
    • Mounts tmp directories to the container
  • $H2_CONTAINER_LOC/h2-rstudio_4.1.0.sif
    • Location of RStudio container
    • Can be changed to a different RStudio version
  • Information about the RStudio session will be displayed
    • Note the compute node name and port number.
    • Displays the ssh -N -L ... command to run

Note

KEEP THIS TERMINAL OPEN UNTIL YOUR JOB IS DONE
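
Before starting the container, it can help to confirm that the bind directories and the container image exist. A minimal sketch, assuming a hypothetical function named preflight (not part of Apptainer or Hoffman2):

```shell
#!/bin/bash
# Hypothetical pre-flight check: verify bind directories and container image exist.
# Usage: preflight TMP_BASE CONTAINER_IMAGE
preflight() {
    local base="$1" image="$2" status=0 sub
    # Each of these directories is bind-mounted into the container
    for sub in var/lib var/run tmp; do
        if [ ! -d "${base}/${sub}" ]; then
            echo "missing directory: ${base}/${sub}"
            status=1
        fi
    done
    # The .sif file is the container image passed to apptainer run
    if [ ! -f "$image" ]; then
        echo "missing container image: ${image}"
        status=1
    fi
    return "$status"
}

# Example call using the paths from this presentation
if preflight "${SCRATCH:-}/rstudiotmp" "${H2_CONTAINER_LOC:-}/h2-rstudio_4.1.0.sif"; then
    echo "ready to run apptainer"
else
    echo "fix the items above before running apptainer"
fi
```

On a compute node, $H2_CONTAINER_LOC is only set after module load apptainer, so run the check after loading the module.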

Running RStudio (5)

  • Open another terminal on your local computer

  • Run the port forward command

    • Creates a connection from local computer to compute node
ssh -N -L 8787:nXXX:8787 username@hoffman2.idre.ucla.edu
  • Change port 8787 if needed
  • nXXX is the compute node name
  • username is your Hoffman2 username
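
The pieces of the port forward command can be filled in with a small helper. A minimal sketch, assuming a hypothetical function named build_tunnel; joebruin and n1234 are placeholder values, and the function only prints the command:

```shell
#!/bin/bash
# Hypothetical helper: print the ssh port-forward command.
# Usage: build_tunnel USERNAME NODE [REMOTE_PORT] [LOCAL_PORT]
build_tunnel() {
    local user="$1" node="$2" remote_port="${3:-8787}"
    # By default the local port matches the port on the compute node
    local local_port="${4:-$remote_port}"
    echo "ssh -N -L ${local_port}:${node}:${remote_port} ${user}@hoffman2.idre.ucla.edu"
}

build_tunnel joebruin n1234              # default port 8787 on both ends
build_tunnel joebruin n1234 8790         # RStudio reported a different port
build_tunnel joebruin n1234 8787 9000    # local port 8787 is already in use
```

In the last case the first number in -L is the local port, so the browser would connect to port 9000 on localhost.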

Running RStudio (6)

  • Finally, open a web browser
    • Type URL of RStudio Server
    • Will ALWAYS be localhost
    • Change port 8787 if needed
http://localhost:8787

Running RStudio - The Easy Way

  • h2_rstudio.sh
    • Script that runs everything from the previous slides
    • Starts RStudio and opens a web browser for you
    • Runs on your local computer (not Hoffman2)

h2_rstudio.sh Information

Look at our GitHub page

  • Download script
wget https://raw.githubusercontent.com/ucla-oarc-hpc/H2-RStudio/main/h2_rstudio.sh
chmod +x h2_rstudio.sh
  • To display how to use this script
./h2_rstudio.sh -h
  • Run script
    • Replace username with Hoffman2 username
./h2_rstudio.sh -u username

Tested Platforms

Mac’s Terminal app

Windows WSL2

MobaXterm

Git Bash

RStudio Script

This RStudio Script is currently on our GitHub page

Info on this RStudio Container (1)

  • RStudio container was built using Docker
    • Based on RStudio images from the Rocker Project
    • Hoffman2 containers located at $H2_CONTAINER_LOC
    • RStudio containers are named:
      • h2-rstudio_X.Y.Z.sif
      • Where X.Y.Z is the R version
  • View all available RStudio containers by running
module load apptainer
ls $H2_CONTAINER_LOC/h2-rstudio*sif
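
Because the R version is part of the file name, it can be pulled out with shell parameter expansion. A minimal sketch, assuming a hypothetical function named list_r_versions and the h2-rstudio_X.Y.Z.sif naming described above:

```shell
#!/bin/bash
# Print the R version encoded in each h2-rstudio_X.Y.Z.sif file name in a directory.
# Usage: list_r_versions DIRECTORY
list_r_versions() {
    local dir="$1" path name version
    for path in "$dir"/h2-rstudio_*.sif; do
        [ -e "$path" ] || continue          # skip when the glob matched nothing
        name="${path##*/}"                  # strip the directory part
        version="${name#h2-rstudio_}"       # strip the prefix
        version="${version%.sif}"           # strip the extension
        echo "$version"
    done
}

# Example call; prints nothing if the directory has no matching containers
list_r_versions "${H2_CONTAINER_LOC:-.}"
```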

Info on this RStudio Container (2)

  • Separate build of R and
  • R packages installed in unique directory
    • ~/R/APPTAINER/h2-rstudio_4.1.0 (for h2-rstudio_4.1.0.sif)
  • HPC Container files
    • Docker and definition files for Hoffman2 containers
    • RStudio Dockerfiles have all you need to build RStudio

R Package Installs

  • Some R packages require extra libraries or software in the container
  • Contact us to update this container
    • OR you can modify the Dockerfile for your own container

Tips for Running RStudio (1)

  • If RStudio does not start up
    • Possibly due to a previous RStudio session not shutting down correctly
  • Clear out the tmp directories
rm -rf $SCRATCH/rstudiotmp
  • Clear out RStudio config files
rm -rf ~/.config/rstudio
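
If you would rather keep a copy of your settings, the directories can be moved aside instead of deleted. A minimal sketch, assuming a hypothetical function named move_aside; this is an alternative to the rm -rf commands above, not a Hoffman2 tool:

```shell
#!/bin/bash
# Hypothetical helper: move a directory aside instead of deleting it.
# Usage: move_aside DIRECTORY
move_aside() {
    local dir="$1"
    if [ ! -e "$dir" ]; then
        echo "nothing to do: ${dir} does not exist"
        return 0
    fi
    # A timestamp suffix keeps earlier backups from being overwritten
    local backup="${dir}.bak.$(date +%Y%m%d%H%M%S)"
    mv "$dir" "$backup" && echo "moved ${dir} to ${backup}"
}

# Example usage (uncomment to run):
# move_aside "$HOME/.config/rstudio"
# move_aside "$SCRATCH/rstudiotmp"
```

Remove the .bak directories once RStudio starts correctly again.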

Tips for Running RStudio (2)

  • Access to a Hoffman2 terminal in RStudio

Using Batch R

  • Instead of interactive RStudio, you can run R as a non-interactive batch job
    • Use R from inside RStudio container as a qsub job
  • Create a job script
    • Load Apptainer
    • Use the RStudio container with a .R script
      • apptainer exec
#!/bin/bash
#$ -cwd
#$ -o rstudio_batch.out.$JOB_ID
#$ -j y
#$ -l h_rt=3:00:00,h_data=10G
#$ -pe shared 1

# Load Apptainer module
. /u/local/Modules/default/init/modules.sh
module load apptainer

# Run R with an R script named myRtest.R
apptainer exec $H2_CONTAINER_LOC/h2-rstudio_4.1.0.sif R CMD BATCH myRtest.R
  • Then run this job script
qsub rstudio_batch.job
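
By default, R CMD BATCH writes its output to a file named after the script with .Rout appended (myRtest.Rout here). A minimal sketch for checking that file after the job ends, assuming hypothetical helpers named rout_name and check_rout, and assuming a failed run leaves "Execution halted" in the output:

```shell
#!/bin/bash
# Name of the output file R CMD BATCH writes by default for a given script
rout_name() {
    local script="$1"
    echo "${script%.R}.Rout"
}

# Report whether the batch run appears to have failed.
# Assumption: a failed run leaves "Execution halted" in the .Rout file.
check_rout() {
    local rout="$1"
    if [ ! -f "$rout" ]; then
        echo "no output yet: ${rout}"
        return 2
    fi
    if grep -q "Execution halted" "$rout"; then
        echo "R reported an error; see ${rout}"
        return 1
    fi
    echo "no error found in ${rout}"
}

rout_name myRtest.R
# Example usage after the job finishes (uncomment to run):
# check_rout "$(rout_name myRtest.R)"
```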

Summary

  • Utilize RStudio Server on Hoffman2
    • Access through your web browser
    • Applicable to other HPC resources as well
  • RStudio can be used interactively or as a non-interactive batch job
  • Use the h2_rstudio.sh script for easy setup

Thanks and Happy Computing!

Questions? Comments?